Semantic role labeling
Semantic role labeling (SRL) is a task in natural language processing consisting of the detection of the semantic arguments associated with the predicate or verb of a sentence and their classification into their specific roles [Wikipedia contributors, "Semantic role labeling," Wikipedia, The Free Encyclopedia, http://en.wikipedia.org/w/index.php?title=Semantic_role_labeling&oldid=589516830 (accessed September 10, 2014)]. It is considered a shallow semantic parsing task. A successful execution of SRL transforms a sentence into a set of propositions. A good starting point is to ask what kinds of propositions there are and which set of propositions is sufficient to transcribe human languages. Unfortunately, there is no definitive answer to these questions, although there are candidates such as case theory and semantic frames. Most research uses local identification and classification followed by global inference; however, integrated and incremental approaches have also been developed.

Theories
Two broad families exist:
# Syntax-based approach: explaining the varied expression of verb arguments within syntactic positions
: Levin (1993) verb classes ⇒ VerbNet (Kipper et al., 2000) ⇒ PropBank (Palmer et al., 2005): focused on verbs (lately the nominal bank, NomBank, is used together with PropBank in many semantic tasks)
# Situation-based approach (a word activates/invokes a frame of semantic knowledge that relates linguistic semantics to encyclopedic knowledge)
: Frame semantics (Fillmore, 1976) ⇒ FrameNet (Fillmore et al., 2004): words with other parts of speech can invoke frames too (e.g., nouns, adjectives)

Local approaches
Local stage in Björkelund et al. (2009) [Björkelund, A., Hafdell, L., & Nugues, P. (2009). Multilingual Semantic Role Labeling. In Proceedings of the Thirteenth Conference on Computational Natural Language Learning (CoNLL 2009): Shared Task (pp. 43–48). Association for Computational Linguistics.]. Also uses the term "local": Toutanova et al.
2005.

According to Zapirain et al. (2013) [Zapirain, B., Agirre, E., Màrquez, L., & Surdeanu, M. (2013). Selectional preferences for semantic role classification. Computational Linguistics, 39(3), 631-663.], argument identification is mostly syntactic: "... typically perform SRL in two sequential steps: argument identification and argument classification. Whereas the former is mostly a syntactic recognition task, the latter usually requires semantic knowledge to be taken into account"

Predicate identification

Pruning
Pruning removes candidates that are clearly not arguments of a given predicate, to save training time and, more importantly, to improve performance (Punyakanok et al., 2008) [Punyakanok, V., Roth, D., & Yih, W. (2008). The Importance of Syntactic Parsing and Inference in Semantic Role Labeling. Computational Linguistics, 34(2), 257–287. doi:10.1162/coli.2008.34.2.257]. However, mate tools (Björkelund et al., 2009) does not employ this step.

Pruning algorithm for constituent syntactic parse trees (Xue & Palmer, 2004) [Xue, N., & Palmer, M. (2004). Calibrating Features for Semantic Role Labeling. EMNLP, 88–94. Retrieved from http://verbs.colorado.edu/~xuen/publications/emnlp04.pdf]:
* Step 1: Designate the predicate as the current node and collect its sisters (constituents attached at the same level as the predicate) unless its sisters are coordinated with the predicate. If a sister is a PP, also collect its immediate children.
* Step 2: Reset the current node to its parent and repeat Step 1 until it reaches the top-level node.

Argument identification

Argument classification

Global (joint) scoring

Reranking
"The early work of Gildea and Jurafsky (2002) [Gildea, D., & Jurafsky, D. (2002). Automatic labeling of semantic roles. Computational Linguistics, 28(3), 245-288.] produced a set of possible sequences of labels for the entire sentence by combining the most likely few labels for each constituent. The probabilities produced by the classifiers for individual constituents were combined with a probability for the (unordered) set of roles appearing in the entire sentence, conditioned on the predicate. This reranking step improves performance, but because of the use of frequency-based probabilities, the reranking suffers from the same inability to exploit larger numbers of features as the lattice backoff used for individual role classification." [Palmer, M., Gildea, D., & Xue, N. (2010). Semantic Role Labeling. Synthesis Lectures on Human Language Technologies, 3(1), page 44. doi:10.2200/S00239ED1V01Y200912HLT006]

Toutanova et al. 2005: log-linear reranking model applied to the top N solutions.

P(L \mid t,v) = \frac{e^{\langle \Phi(t,v,L), W \rangle}}{\sum_{j=1}^{N} e^{\langle \Phi(t,v,L_j), W \rangle}}

Re-ranking of several candidate solutions (Toutanova et al., 2008) (+learning +dependencies –search)

Viterbi search

Integer linear programming
Combine local predictions through ILP to find the best solution according to structural and linguistic constraints (Koomen et al., 2005; Punyakanok et al., 2008) (–learning +dependencies +search)

Integrated (global) approaches
An early study was carried out by Zhao et al. (2009) [Zhao, H., Chen, W., & Kit, C. (2009, August). Semantic dependency parsing of NomBank and PropBank: An efficient integrated approach via a large-scale feature selection. In Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing: Volume 1 (pp. 30-39). Association for Computational Linguistics.]. James Henderson and Ivan Titov's group worked on joint, synchronized syntactic-semantic parsing (Henderson et al. 2008 [Henderson, J., Merlo, P., Musillo, G., & Titov, I. (2008, August). A latent variable model of synchronous parsing for syntactic and semantic dependencies. In Proceedings of the Twelfth Conference on Computational Natural Language Learning (pp. 178-182). Association for Computational Linguistics.]; Titov et al. 2009 [Titov, I., Henderson, J., Merlo, P., & Musillo, G. (2009, July). Online Graph Planarisation for Synchronous Parsing of Semantic and Syntactic Dependencies. In IJCAI (pp. 1562-1567).]; Henderson et al. 2013 [Henderson, J., Merlo, P., Titov, I., & Musillo, G. (2013). Multilingual joint parsing of syntactic and semantic dependencies with a latent variable model. Computational Linguistics, 39(4), 949-998.]).

Global search integrating joint scoring: Tree CRFs (Cohn & Blunsom, 2005) (+learning +/–dependencies +/–search)

CRF over tree structure (Cohn & Blunsom, 2005) [Cohn, T., & Blunsom, P. (2005). Semantic Role Labelling with Tree Conditional Random Fields. In Proceedings of the Ninth Conference on Computational Natural Language Learning (CoNLL-2005) (pp. 169–172). Association for Computational Linguistics.], CRF over sequence (Màrquez et al., 2005) [Màrquez, L., Comas, P., Giménez, J., & Català, N. (2005). Semantic Role Labeling as Sequential Tagging. In Proceedings of the Ninth Conference on Computational Natural Language Learning (CoNLL-2005) (pp. 193–196). Association for Computational Linguistics.]

Incremental approaches
Choi and Palmer (2011) [Choi, J. D., & Palmer, M. (2011, June). Transition-based semantic role labeling using predicate argument clustering. In Proceedings of the ACL 2011 Workshop on Relational Models of Semantics (pp. 37-45). Association for Computational Linguistics.] devised an elegant transition-based model, but it did not receive much attention. Konstas et al. (2014) argue that incremental SRL is intrinsically harder and should be viewed as a separate task. They rely on an intricate syntactic parser and build a complicated SRL system... Their evaluation is not compatible with the standard evaluation.

Uncategorized
* Back-off lattice-based relative frequency models (02, Palmer 02)
* Decision trees (et al.
03)
* Support Vector Machines (et al. 04 et al. 07)
* Log-linear models (04et al. 05)
* SNoW (et al. 04,05)
* AdaBoost, TBL, IBL

Features
Various features have been proposed for SRL; they can be divided into broad categories:
* Lexical features: word form, lemma
* Morphosyntactic features: part-of-speech
* Positional features: distance
* Syntactic features: dependency label, valency, constituent/dependency paths
* Semantic features: role, frame

Evaluation

Metrics
Some papers report precision (P), recall (R), and F1 on argument identification and argument classification (but not on predicate identification and disambiguation) [Choi, J. D., & Palmer, M. (2011). Transition-based Semantic Role Labeling Using Predicate Argument Clustering. In Proceedings of the ACL 2011 Workshop on Relational Models of Semantics (pp. 37–45). Stroudsburg, PA, USA: Association for Computational Linguistics.]. In CoNLL-2005, "for an argument to be correctly recognized, the words spanning the argument as well as its semantic role have to be correct." (Carreras & Màrquez 2005) [Carreras, X., & Màrquez, L. (2005). Introduction to the CoNLL-2005 Shared Task: Semantic Role Labeling. Proceedings of CoNLL-2005, 152–164.]

"F1 score on the SemEval 2007 task of collectively identifying frame-evoking targets, a disambiguated frame for each target, and the set of role-labeled arguments for each frame." [Das, D. (2014). Statistical Models for Frame-Semantic Parsing. Proceedings of Frame Semantics in NLP: A Workshop in Honor of Chuck Fillmore (1929-2014), (2007), 26–29. Retrieved from http://www.aclweb.org/anthology/W/W14/W14-3007]

See also: Dependency-based SRL evaluation

Data and evaluation campaigns
Constituent-based
# CoNLL 2004 and 2005
Dependency-based
# CoNLL 2008
# CoNLL 2009

Coverage gap
Available lexical resources represent only a small portion of English. Palmer and Sporleder (2010) [Alexis Palmer and Caroline Sporleder. 2010. Evaluating FrameNet-style semantic parsing: the role of coverage gaps in FrameNet. In Proceedings of the 23rd International Conference on Computational Linguistics: Posters (COLING '10). Association for Computational Linguistics, Stroudsburg, PA, USA, 928-936.] showed that the accuracy of a straightforward supervised system has an upper bound of approximately 46.8% on full texts. Semi-supervised, unsupervised and cross-lingual approaches have been proposed to ease this problem [Ivan Titov. Semantic Role Labeling Tutorial: Part 3 - Semi-, unsupervised and cross-lingual approaches. NAACL 2013].

Applications
"Shallow semantic analysis based on FrameNet data has been recently utilized across various natural language processing applications with success. These include the generation of meeting summaries (Kleinbauer, 2012), the prediction of stock price movement using (Xie et al., 2013), inducing slots for domain-specific dialog systems (Chen et al., 2013), stance classification in debates (Hasan and Ng, 2013), modeling the clarity of student essays (Persing and Ng, 2013) to name a few. There is strong potential in using frame-semantic structures in other applications such as question answering and machine translation, as demonstrated by prior work using PropBank-style SRL annotations (Shen and Lapata, 2007; Liu and Gildea, 2010)."

See also
* Semantic role labeling (state-of-the-art)
* Applications of distributed representation#Semantic role labeling
* List of semantic role labelers
* Semantic role induction
* Palmer, M., Gildea, D., & Xue, N. (2010). Semantic Role Labeling.
* Grounded semantic role labeling

External links
* Semantic Role Labeling Tutorial at NAACL 2013
* Lluís Màrquez. Semantic Role Labeling - Past, Present and Future. ACL-IJCNLP 2009
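
Example: pruning sketch
The two pruning steps attributed to Xue & Palmer (2004) in the Pruning section can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the `Node` class, its `words` field, and the function name `prune_candidates` are hypothetical, and coordination is approximated by checking for a sister labeled `CC`, which is a simplification.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)
class Node:
    """Hypothetical minimal constituent node (not from any SRL toolkit)."""
    label: str
    words: str = ""
    children: List["Node"] = field(default_factory=list)
    parent: Optional["Node"] = field(default=None, repr=False)

    def add(self, *kids: "Node") -> "Node":
        # Attach children and set their parent pointers.
        for kid in kids:
            kid.parent = self
            self.children.append(kid)
        return self


def prune_candidates(predicate: Node) -> List[Node]:
    """Collect argument candidates by walking from the predicate to the root."""
    candidates: List[Node] = []
    current = predicate
    while current.parent is not None:
        # Step 1: sisters of the current node.
        sisters = [n for n in current.parent.children if n is not current]
        # Assumption: a CC sister signals coordination, so this level is skipped.
        if not any(s.label == "CC" for s in sisters):
            for sister in sisters:
                candidates.append(sister)
                if sister.label == "PP":
                    # For a PP sister, also collect its immediate children.
                    candidates.extend(sister.children)
        # Step 2: move up one level and repeat.
        current = current.parent
    return candidates


# Toy tree for "She gave the book to Mary" (hand-built, for illustration only).
predicate = Node("VBD", "gave")
pp = Node("PP").add(Node("IN", "to"), Node("NP", "Mary"))
vp = Node("VP").add(predicate, Node("NP", "the book"), pp)
tree = Node("S").add(Node("NP", "She"), vp)

candidates = [(n.label, n.words) for n in prune_candidates(predicate)]
print(candidates)
```

On this toy tree the candidates are the object NP, the PP and its two children, and the subject NP; every other constituent is never passed to the argument classifier.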
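
Example: log-linear reranking scores
The log-linear reranking formula given in the Reranking section is a softmax over the N candidate label sequences, where each candidate is scored by the inner product of its feature vector Φ(t,v,L) and the weight vector W. The sketch below only illustrates that arithmetic; the feature names and weights are invented for the example and are not taken from Toutanova et al.

```python
import math
from typing import Dict, List


def rerank_probabilities(
    candidate_features: List[Dict[str, float]],
    weights: Dict[str, float],
) -> List[float]:
    """Return P(L_i | t, v) for each of the N candidates under a log-linear model."""
    # Inner product <Phi(t, v, L_i), W> for each candidate, with sparse features.
    scores = [
        sum(weights.get(name, 0.0) * value for name, value in feats.items())
        for feats in candidate_features
    ]
    # Subtract the maximum score before exponentiating for numerical stability;
    # this does not change the normalized probabilities.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical weights and feature vectors for three candidate labelings.
weights = {"has_A0_before_verb": 1.0, "repeated_A1": -2.0}
top_n = [
    {"has_A0_before_verb": 1.0},
    {"has_A0_before_verb": 1.0, "repeated_A1": 1.0},
    {},
]

probs = rerank_probabilities(top_n, weights)
best = max(range(len(probs)), key=probs.__getitem__)
print(best, probs)
```

The reranker picks the candidate with the highest probability; here the labeling with a repeated core role is penalized by its negative weight.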